# Seurat
library(Seurat)
library(SeuratDisk)
# Single R
library(SingleCellExperiment)
library(SingleR)
library(celldex)
# Data
library(dplyr)
# Plotting
library(ggplot2)
library(RColorBrewer)
library(patchwork)
Attaching SeuratObject
Registered S3 method overwritten by 'cli':
method from
print.boxx spatstat.geom
Registered S3 method overwritten by 'SeuratDisk':
method from
as.sparse.H5Group Seurat
Loading required package: SummarizedExperiment
Loading required package: MatrixGenerics
Loading required package: matrixStats
Attaching package: ‘MatrixGenerics’
The following objects are masked from ‘package:matrixStats’:
colAlls, colAnyNAs, colAnys, colAvgsPerRowSet, colCollapse,
colCounts, colCummaxs, colCummins, colCumprods, colCumsums,
colDiffs, colIQRDiffs, colIQRs, colLogSumExps, colMadDiffs,
colMads, colMaxs, colMeans2, colMedians, colMins, colOrderStats,
colProds, colQuantiles, colRanges, colRanks, colSdDiffs, colSds,
colSums2, colTabulates, colVarDiffs, colVars, colWeightedMads,
colWeightedMeans, colWeightedMedians, colWeightedSds,
colWeightedVars, rowAlls, rowAnyNAs, rowAnys, rowAvgsPerColSet,
rowCollapse, rowCounts, rowCummaxs, rowCummins, rowCumprods,
rowCumsums, rowDiffs, rowIQRDiffs, rowIQRs, rowLogSumExps,
rowMadDiffs, rowMads, rowMaxs, rowMeans2, rowMedians, rowMins,
rowOrderStats, rowProds, rowQuantiles, rowRanges, rowRanks,
rowSdDiffs, rowSds, rowSums2, rowTabulates, rowVarDiffs, rowVars,
rowWeightedMads, rowWeightedMeans, rowWeightedMedians,
rowWeightedSds, rowWeightedVars
Loading required package: GenomicRanges
Loading required package: stats4
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: ‘BiocGenerics’
The following objects are masked from ‘package:parallel’:
clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB
The following objects are masked from ‘package:stats’:
IQR, mad, sd, var, xtabs
The following objects are masked from ‘package:base’:
anyDuplicated, append, as.data.frame, basename, cbind, colnames,
dirname, do.call, duplicated, eval, evalq, Filter, Find, get, grep,
grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget,
order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
rbind, Reduce, rownames, sapply, setdiff, sort, table, tapply,
union, unique, unsplit, which.max, which.min
Loading required package: S4Vectors
Attaching package: ‘S4Vectors’
The following objects are masked from ‘package:base’:
expand.grid, I, unname
Loading required package: IRanges
Loading required package: GenomeInfoDb
Loading required package: Biobase
Welcome to Bioconductor
Vignettes contain introductory material; view with
'browseVignettes()'. To cite Bioconductor, see
'citation("Biobase")', and for packages 'citation("pkgname")'.
Attaching package: ‘Biobase’
The following object is masked from ‘package:MatrixGenerics’:
rowMedians
The following objects are masked from ‘package:matrixStats’:
anyMissing, rowMedians
Attaching package: ‘SummarizedExperiment’
The following object is masked from ‘package:SeuratObject’:
Assays
The following object is masked from ‘package:Seurat’:
Assays
Attaching package: ‘celldex’
The following objects are masked from ‘package:SingleR’:
BlueprintEncodeData, DatabaseImmuneCellExpressionData,
HumanPrimaryCellAtlasData, ImmGenData, MonacoImmuneData,
MouseRNAseqData, NovershternHematopoieticData
Attaching package: ‘dplyr’
The following object is masked from ‘package:Biobase’:
combine
The following objects are masked from ‘package:GenomicRanges’:
intersect, setdiff, union
The following object is masked from ‘package:GenomeInfoDb’:
intersect
The following objects are masked from ‘package:IRanges’:
collapse, desc, intersect, setdiff, slice, union
The following objects are masked from ‘package:S4Vectors’:
first, intersect, rename, setdiff, setequal, union
The following objects are masked from ‘package:BiocGenerics’:
combine, intersect, setdiff, union
The following object is masked from ‘package:matrixStats’:
count
The following objects are masked from ‘package:stats’:
filter, lag
The following objects are masked from ‘package:base’:
intersect, setdiff, setequal, union
# Set working directory to project root
setwd(gsub("/script/seurat", "", getwd()))
# Source files
source("plotting_global.R")
# Filtering Parameter
nFeature_RNA_min_m <- 250
nFeature_RNA_min_p <- 500
nFeature_RNA_max_m <- 4000
nFeature_RNA_max_p <- 4000
nCount_RNA_min_m <- 1000
nCount_RNA_min_p <- 1500
nCount_RNA_max_m <- 20000
nCount_RNA_max_p <- 20000
pMt_RNA_max_m <- 5
pMt_RNA_max_p <- 5
# Files
so_raw_file <- "data/seurat_object/so_raw.rds"
so_qc_file <- "data/seurat_object/so_qc.rds"
loom_qc_file <- "data/loom_object/loom_qc"
# Plotting Theme
ggplot2::theme_set(theme_global_set()) # From project global source()
so_raw <- readRDS(so_raw_file)
so_raw$nFeature_RNA_max <- ifelse(so_raw$tissue == "Myeloid", nFeature_RNA_max_m, nFeature_RNA_max_p)
so_raw$nFeature_RNA_min <- ifelse(so_raw$tissue == "Myeloid", nFeature_RNA_min_m, nFeature_RNA_min_p)
so_raw$nCount_RNA_max <- ifelse(so_raw$tissue == "Myeloid", nCount_RNA_max_m, nCount_RNA_max_p)
so_raw$nCount_RNA_min <- ifelse(so_raw$tissue == "Myeloid", nCount_RNA_min_m, nCount_RNA_min_p)
so_raw$pMt_RNA_max <- ifelse(so_raw$tissue == "Myeloid", pMt_RNA_max_m, pMt_RNA_max_p)
# QC
so_raw$qc_class <- ifelse(
so_raw$cellranger_class == "Cell" &
so_raw$nFeature_RNA <= so_raw$nFeature_RNA_max &
so_raw$nFeature_RNA >= so_raw$nFeature_RNA_min &
so_raw$nCount_RNA <= so_raw$nCount_RNA_max &
so_raw$nCount_RNA >= so_raw$nCount_RNA_min &
so_raw$pMt_RNA <= so_raw$pMt_RNA_max,
"pass", "fail"
)
Empty droplets were determined with CellRanger V3.0.2 Lun et al., 2019 EmptyDrop heuristic. RNAse activity of granulocytes might be wrongly identified as empty cells by CellRanger.
Typical Sample A steep drop-off is indicative of good separation between the cell-associated barcodes and the barcodes associated with empty GEMs. A ideal barcode rank plot has a distincitve shape, which is referred to as a "cliff and knee".
Heterogeneous Sample Heterogeneous populations of cells in a sample result in two "cliff and knee" distributions. However, there should still be clear separation between the bacodes.
Compromised Sample Round curve and lack of steep cliff may indicate low sample quality or loss of single-cell behavior. This can be due to a wetting failure, premature cell lysis, or low cell viability.
Compromised Sample Defined cliff and knee, but the total number of barcodes detected may be lower than expected. This can be caused by a sample clog or inaccurate cell count.
rank_plot <- ggplot(so_raw@meta.data, aes(x = log10(nCount_RNA_rank), y = log10(nCount_RNA), color = cellranger_class)) +
geom_point() +
geom_hline(aes(yintercept = log10(nCount_RNA_min)), color = "red", linetype = "longdash") +
scale_color_manual(values = so_color$cellranger_class) +
ggtitle("Barcode rank plot") +
xlab("log10(cell barcode rank)") + ylab("log10(cell UMI counts)") +
facet_grid(tissue~treatment) +
theme(aspect.ratio = 1, legend.position = "bottom")
options(repr.plot.width = 5, repr.plot.height = 5)
rank_plot
ggsave(rank_plot, filename = "result/plot/seurat/rank_plot.png", width = 4, height = 4)
so_qc <- subset(so_raw, subset = cellranger_class == "Cell")
qc_1 <- ggplot(so_qc@meta.data, aes(x = log10(nCount_RNA), fill = tissue)) +
geom_density() +
ggtitle("Density plot UMI count") + xlab("log10(UMI count)") + ylab("Density") +
geom_vline(aes(xintercept = log10(nCount_RNA_min)), color = "red", linetype = "longdash") +
scale_x_continuous(breaks = integer_breaks()) +
scale_fill_manual(values = so_color$tissue) +
facet_grid(tissue~treatment) +
theme(legend.position = "bottom", aspect.ratio = 1)
qc_2 <- ggplot(so_qc@meta.data, aes(x = log10(nFeature_RNA), fill = tissue)) +
geom_density() +
ggtitle("Density plot Feature count") + xlab("log10(Feature count)") + ylab("Density") +
geom_vline(aes(xintercept = log10(nFeature_RNA_min)), color = "red", linetype = "longdash") +
scale_x_continuous(breaks = integer_breaks()) +
scale_fill_manual(values = so_color$tissue) +
facet_grid(tissue~treatment) +
theme(legend.position = "bottom", aspect.ratio = 1)
qc_3 <- ggplot(so_qc@meta.data, aes(x = pMt_RNA, fill = tissue)) +
geom_density() +
ggtitle("Density plot Mt %") + xlab("Mt [%]") + ylab("Density") +
geom_vline(aes(xintercept = pMt_RNA_max), color = "red", linetype = "longdash") +
scale_x_continuous(breaks = integer_breaks()) +
xlim(0, 20) +
scale_fill_manual(values = so_color$tissue) +
facet_grid(tissue~treatment) +
theme(legend.position = "bottom", aspect.ratio = 1)
options(repr.plot.width = 15, repr.plot.height = 5)
qc_1 + qc_2 + qc_3 + plot_layout(guides = "collect") & theme(legend.position = "bottom")
ggsave(qc_1, filename = "result/plot/seurat/density_umi.png", width = 4, height = 4)
ggsave(qc_2, filename = "result/plot/seurat/density_feature.png", width = 4, height = 4)
ggsave(qc_3, filename = "result/plot/seurat/density_mt.png", width = 4, height = 4)
Scale for 'x' is already present. Adding another scale for 'x', which will replace the existing scale. Warning message: “Removed 3813 rows containing non-finite values (stat_density).” Warning message: “Removed 3813 rows containing non-finite values (stat_density).”
sc_1 <- ggplot(so_qc@meta.data, aes(x = log10(nCount_RNA), y = log10(nFeature_RNA), color = pMt_RNA)) +
geom_point() + ggtitle("Mitochondrial gene percentage") + ylab("log10(feature count)") + xlab("log10(umi count)") +
geom_vline(aes(xintercept = log10(nCount_RNA_min)), color = "red", linetype = "longdash") +
#geom_vline(aes(xintercept = log10(nCount_RNA_max)), color = "red", linetype = "longdash") +
geom_hline(aes(yintercept = log10(nFeature_RNA_min)), color = "red", linetype = "longdash") +
geom_hline(aes(yintercept = log10(nFeature_RNA_max)), color = "red", linetype = "longdash") +
facet_grid(tissue~treatment) + theme(aspect.ratio = 1, legend.position = "bottom") +
scale_size(guide = guide_legend(direction = "vertical"))
sc_2 <- ggplot(so_qc@meta.data, aes(x = log10(nCount_RNA), y = log10(nFeature_RNA), color = pHb_RNA)) +
geom_point() + ggtitle("Hemoglobin gene percentage") + ylab("log10(feature count)") + xlab("log10(umi count)") +
geom_vline(aes(xintercept = log10(nCount_RNA_min)), color = "red", linetype = "longdash") +
#geom_vline(aes(xintercept = log10(nCount_RNA_max)), color = "red", linetype = "longdash") +
geom_hline(aes(yintercept = log10(nFeature_RNA_min)), color = "red", linetype = "longdash") +
geom_hline(aes(yintercept = log10(nFeature_RNA_max)), color = "red", linetype = "longdash") +
facet_grid(tissue~treatment) + theme(aspect.ratio = 1, legend.position = "bottom") +
scale_size(guide = guide_legend(direction = "vertical"))
sc_3 <- ggplot(so_qc@meta.data, aes(x = log10(nCount_RNA), y = log10(nFeature_RNA), color = pRp_RNA)) +
geom_point() + ggtitle("Ribsonmal gene percentage") + ylab("log10(feature count)") + xlab("log10(umi count)") +
geom_vline(aes(xintercept = log10(nCount_RNA_min)), color = "red", linetype = "longdash") +
#geom_vline(aes(xintercept = log10(nCount_RNA_max)), color = "red", linetype = "longdash") +
geom_hline(aes(yintercept = log10(nFeature_RNA_min)), color = "red", linetype = "longdash") +
geom_hline(aes(yintercept = log10(nFeature_RNA_max)), color = "red", linetype = "longdash") +
facet_grid(tissue~treatment) + theme(aspect.ratio = 1, legend.position = "bottom") +
scale_size(guide = guide_legend(direction = "vertical"))
options(repr.plot.width = 15, repr.plot.height = 5)
sc_1 + sc_2 + sc_3 & theme(legend.position = "bottom")
ggsave(sc_1, filename = "result/plot/seurat/sc_mt.png", width = 4, height = 4)
ggsave(sc_2, filename = "result/plot/seurat/sc_hg.png", width = 4, height = 4)
ggsave(sc_3, filename = "result/plot/seurat/sc_rb.png", width = 4, height = 4)
so_qc <- subset(so_qc, subset = nFeature_RNA >= nFeature_RNA_min & nFeature_RNA <= nFeature_RNA_max & nCount_RNA >= nCount_RNA_min & nCount_RNA <= nCount_RNA_max & pMt_RNA <= pMt_RNA_max)
cc_kowalczyk <- read.csv("cc_kowalczyk.csv")
cc_kowalczyk <- cc_kowalczyk[cc_kowalczyk$sig_population >= 5, ]
cc_kowalczyk_g0 <- cc_kowalczyk[cc_kowalczyk$state == "down", ]
cc_kowalczyk_int <- cc_kowalczyk[cc_kowalczyk$state == "up", ]
cc_kowalczyk_s <- cc_kowalczyk_int[cc_kowalczyk_int$S > cc_kowalczyk_int$G2_M, ]
cc_kowalczyk_g2m <- cc_kowalczyk_int[cc_kowalczyk_int$G2_M > cc_kowalczyk_int$S, ]
so_raw <- CellCycleScoring(so_raw, s.features = cc_kowalczyk_s$gene, g2m.features = cc_kowalczyk_g2m$gene, set.ident = FALSE)
colnames(so_raw@meta.data) <- gsub("Phase", "cc_phase_class", colnames(so_raw@meta.data))
colnames(so_raw@meta.data) <- gsub("S.Score", "msS_RNA", colnames(so_raw@meta.data))
colnames(so_raw@meta.data) <- gsub("G2M.Score", "msInt_RNA", colnames(so_raw@meta.data))
so_raw$msCC_diff_RNA <- so_raw$msS_RNA - so_raw$msInt_RNA
so_raw <- AddModuleScore(so_raw, features = list(cc_kowalczyk_g0$gene), assays = "RNA", slot = "data", ctrl = 100, nbin = 25, name = "msG0_RNA")
so_raw <- AddModuleScore(so_raw, features = list(cc_kowalczyk_int$gene), assays = "RNA", slot = "data", ctrl = 100, nbin = 25, name = "msInt_RNA")
Warning message: “The following features are not present in the object: Ccdc101, Cirh1a, Hn1l, Cct6a, Atp5o, Apitd1, Stra13, Nt5c3l, Cpsf3l, D17Wsu104e, Rqcd1, Lrdd, Supt16h, Slmo2, Trpc2, Mrp63, Ubl4, Sip1, Ankrd32, Tubg2, 11-Sep, Dynlt1-ps1, Fam165b, Tmem48, Mlf1ip, not searching for symbol synonyms” Warning message: “The following features are not present in the object: Hn1, Th1l, SNORD93, Tssc1, D17H6S56E-5, Rad51l1, Sgol1, Casc5, Fam64a, D2Ertd750e, Fam54a, Hist1h4m, not searching for symbol synonyms” Warning message: “The following features are not present in the object: Pisd-ps2, Ces2f, Pi15, Dcdc2c, Prl8a1, Rdh9, Zfp71-rs1, Vmn1r58, Zfp488, Unc13c, Abca12, Gjc3, Olfr613, Gm11033, Gm10125, Mir1895, A730017L22Rik, Gm13589, Gm14411, A530072M11Rik, Kirrel3, Gbp1, Nox4, AC125373.1, Gm10311, Acss3, Pisd-ps1, not searching for symbol synonyms” Warning message: “The following features are not present in the object: Ccdc101, Cirh1a, Hn1l, Cct6a, Atp5o, Apitd1, Stra13, Nt5c3l, Cpsf3l, D17Wsu104e, Rqcd1, Lrdd, Supt16h, Slmo2, Trpc2, Mrp63, Ubl4, Sip1, Ankrd32, Tubg2, 11-Sep, Dynlt1-ps1, Fam165b, Tmem48, Mlf1ip, Hn1, Th1l, SNORD93, Tssc1, D17H6S56E-5, Rad51l1, Sgol1, Casc5, Fam64a, D2Ertd750e, Fam54a, Hist1h4m, not searching for symbol synonyms”
qc_vln_FUN <- function(data, y, fill, ylab = "", scale_y_log10, ymin, ymax) {
vln_plot_1 <- ggplot(data, aes(x = treatment, y = {{y}}, color = {{fill}})) +
geom_jitter(alpha = 0.2, shape = 16, color = "gray") +
geom_boxplot(alpha = 1.0) + xlab("") + ylim(ymin, ymax) +
scale_color_manual(values = so_color$tissue) +
ggtitle("Cell containing GEM") +
facet_wrap(~tissue, scales = "free_x") + ylab(ylab) +
theme(
plot.title = element_text(size = 12, face = "bold", margin = margin(t = 0, r = 0, b = 5, l = 0)),
axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1),
strip.text = element_blank()
)
vln_plot_2 <- ggplot(data[data$qc_class == "pass", ], aes(x = treatment, y = {{y}}, color = {{fill}})) +
geom_jitter(alpha = 0.2, shape = 16, color = "gray") +
geom_boxplot(alpha = 1.0) + xlab("") + ylim(ymin, ymax) +
ylim(ymin, ymax) +
scale_color_manual(values = so_color$tissue) +
ggtitle("Filtered") +
facet_wrap(~tissue, scales = "free_x") + ylab(ylab) +
theme(
plot.title = element_text(size = 12, face = "bold", margin = margin(t = 0, r = 0, b = 5, l = 0)),
axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1),
strip.text = element_blank()
)
if(scale_y_log10) {
vln_plot_1 <- vln_plot_1 + scale_y_log10(limits = c(ymin, NA))
vln_plot_2 <- vln_plot_2 + scale_y_log10(limits = c(ymin, NA))
}
vln_plot <- vln_plot_1 + vln_plot_2 + plot_layout(ncol = 2, guides = "collect") & theme(legend.position = "bottom")
return(vln_plot)
}
qc_vln_1 <- qc_vln_FUN(so_raw@meta.data, nCount_RNA, tissue, "UMI [count]", FALSE, ymin = 0, ymax = max(so_raw$nCount_RNA))
qc_vln_2 <- qc_vln_FUN(so_raw@meta.data, nFeature_RNA, tissue, "Feature [count]", FALSE, ymin = 0, ymax = max(so_raw$nFeature_RNA))
qc_vln_3 <- qc_vln_FUN(so_raw@meta.data, msCC_diff_RNA, tissue, "cc diff [count]", FALSE, ymin = 0, ymax = max(so_raw$nFeature_RNA))
qc_vln_4 <- qc_vln_FUN(so_raw@meta.data, pMt_RNA, tissue, "Mt [%]", FALSE, ymin = 0, ymax = 100)
qc_vln_5 <- qc_vln_FUN(so_raw@meta.data, pHb_RNA, tissue, "Hb [%]", FALSE, ymin = 0, ymax = 100)
qc_vln_6 <- qc_vln_FUN(so_raw@meta.data, pRp_RNA, tissue, "Rbl [%]", FALSE, ymin = 0, ymax = 100)
options(repr.plot.width = 15, repr.plot.height = 10)
qc_vln_1 + qc_vln_2 + qc_vln_3 + qc_vln_4 + qc_vln_5 + qc_vln_6 + plot_layout(ncol = 3, nrow = 2) + plot_layout(guides = "collect") & theme(legend.position = "bottom")
ggsave(qc_vln_1, filename = "result/plot/seurat/qc_vln_1.png", width = 5, height = 2.5)
ggsave(qc_vln_2, filename = "result/plot/seurat/qc_vln_2.png", width = 5, height = 2.5)
ggsave(qc_vln_3, filename = "result/plot/seurat/qc_vln_3.png", width = 5, height = 2.5)
ggsave(qc_vln_4, filename = "result/plot/seurat/qc_vln_4.png", width = 5, height = 2.5)
ggsave(qc_vln_5, filename = "result/plot/seurat/qc_vln_5.png", width = 5, height = 2.5)
ggsave(qc_vln_6, filename = "result/plot/seurat/qc_vln_6.png", width = 5, height = 2.5)
Scale for 'y' is already present. Adding another scale for 'y', which will
replace the existing scale.
Scale for 'y' is already present. Adding another scale for 'y', which will
replace the existing scale.
Scale for 'y' is already present. Adding another scale for 'y', which will
replace the existing scale.
Scale for 'y' is already present. Adding another scale for 'y', which will
replace the existing scale.
Scale for 'y' is already present. Adding another scale for 'y', which will
replace the existing scale.
Scale for 'y' is already present. Adding another scale for 'y', which will
replace the existing scale.
Warning message:
“Removed 244 rows containing missing values (geom_point).”
Warning message:
“Removed 277 rows containing missing values (geom_point).”
Warning message:
“Removed 212908 rows containing non-finite values (stat_boxplot).”
Warning message:
“Removed 219160 rows containing missing values (geom_point).”
Warning message:
“Removed 4415 rows containing non-finite values (stat_boxplot).”
Warning message:
“Removed 4415 rows containing missing values (geom_point).”
Warning message:
“Removed 519 rows containing non-finite values (stat_boxplot).”
Warning message:
“Removed 1217928 rows containing missing values (geom_point).”
Warning message:
“Removed 4 rows containing missing values (geom_point).”
Warning message:
“Removed 519 rows containing non-finite values (stat_boxplot).”
Warning message:
“Removed 1231281 rows containing missing values (geom_point).”
Warning message:
“Removed 3935 rows containing missing values (geom_point).”
Warning message:
“Removed 519 rows containing non-finite values (stat_boxplot).”
Warning message:
“Removed 1069553 rows containing missing values (geom_point).”
ERROR while rich displaying an object: Error: The given dimensions cannot hold all panels. Please increase `ncol` or `nrow`
Traceback:
1. FUN(X[[i]], ...)
2. tryCatch(withCallingHandlers({
. if (!mime %in% names(repr::mime2repr))
. stop("No repr_* for mimetype ", mime, " in repr::mime2repr")
. rpr <- repr::mime2repr[[mime]](obj)
. if (is.null(rpr))
. return(NULL)
. prepare_content(is.raw(rpr), rpr)
. }, error = error_handler), error = outer_handler)
3. tryCatchList(expr, classes, parentenv, handlers)
4. tryCatchOne(expr, names, parentenv, handlers[[1L]])
5. doTryCatch(return(expr), name, parentenv, handler)
6. withCallingHandlers({
. if (!mime %in% names(repr::mime2repr))
. stop("No repr_* for mimetype ", mime, " in repr::mime2repr")
. rpr <- repr::mime2repr[[mime]](obj)
. if (is.null(rpr))
. return(NULL)
. prepare_content(is.raw(rpr), rpr)
. }, error = error_handler)
7. repr::mime2repr[[mime]](obj)
8. repr_text.default(obj)
9. paste(capture.output(print(obj)), collapse = "\n")
10. capture.output(print(obj))
11. withVisible(...elt(i))
12. print(obj)
13. print.patchwork(obj)
14. build_patchwork(plot, plot$layout$guides %||% "auto")
15. wrap_dims(length(gt), nrow = x$layout$nrow, ncol = x$layout$ncol)
16. abort("The given dimensions cannot hold all panels. Please increase `ncol` or `nrow`")
17. signal_abort(cnd)
Warning message:
“Removed 265 rows containing missing values (geom_point).”
Warning message:
“Removed 283 rows containing missing values (geom_point).”
Warning message:
“Removed 212908 rows containing non-finite values (stat_boxplot).”
Warning message:
“Removed 219022 rows containing missing values (geom_point).”
Warning message:
“Removed 4415 rows containing non-finite values (stat_boxplot).”
Warning message:
“Removed 4415 rows containing missing values (geom_point).”
Warning message:
“Removed 519 rows containing non-finite values (stat_boxplot).”
Warning message:
“Removed 1220495 rows containing missing values (geom_point).”
Warning message:
“Removed 2 rows containing missing values (geom_point).”
Warning message:
“Removed 519 rows containing non-finite values (stat_boxplot).”
Warning message:
“Removed 1231262 rows containing missing values (geom_point).”
Warning message:
“Removed 3948 rows containing missing values (geom_point).”
Warning message:
“Removed 519 rows containing non-finite values (stat_boxplot).”
Warning message:
“Removed 1070312 rows containing missing values (geom_point).”
so_qc <- CellCycleScoring(so_qc, s.features = cc_kowalczyk_s$gene, g2m.features = cc_kowalczyk_g2m$gene, set.ident = FALSE)
colnames(so_qc@meta.data) <- gsub("Phase", "cc_phase_class", colnames(so_qc@meta.data))
colnames(so_qc@meta.data) <- gsub("S.Score", "msS_RNA", colnames(so_qc@meta.data))
colnames(so_qc@meta.data) <- gsub("G2M.Score", "msInt_RNA", colnames(so_qc@meta.data))
so_qc$msCC_diff_RNA <- so_qc$msS_RNA - so_qc$msInt_RNA
so_qc <- AddModuleScore(so_qc, features = list(cc_kowalczyk_g0$gene), assays = "RNA", slot = "data", ctrl = 100, nbin = 25, name = "msG0_RNA")
so_qc <- AddModuleScore(so_qc, features = list(cc_kowalczyk_int$gene), assays = "RNA", slot = "data", ctrl = 100, nbin = 25, name = "msInt_RNA")
Warning message: “The following features are not present in the object: Ccdc101, Cirh1a, Hn1l, Cct6a, Atp5o, Apitd1, Stra13, Nt5c3l, Cpsf3l, D17Wsu104e, Rqcd1, Lrdd, Supt16h, Slmo2, Trpc2, Mrp63, Ubl4, Sip1, Ankrd32, Tubg2, 11-Sep, Dynlt1-ps1, Fam165b, Tmem48, Mlf1ip, not searching for symbol synonyms” Warning message: “The following features are not present in the object: Hn1, Th1l, SNORD93, Tssc1, D17H6S56E-5, Rad51l1, Sgol1, Casc5, Fam64a, D2Ertd750e, Fam54a, Hist1h4m, not searching for symbol synonyms” Warning message: “The following features are not present in the object: Pisd-ps2, Ces2f, Pi15, Dcdc2c, Prl8a1, Rdh9, Zfp71-rs1, Vmn1r58, Zfp488, Unc13c, Abca12, Gjc3, Olfr613, Gm11033, Gm10125, Mir1895, A730017L22Rik, Gm13589, Gm14411, A530072M11Rik, Kirrel3, Gbp1, Nox4, AC125373.1, Gm10311, Acss3, Pisd-ps1, not searching for symbol synonyms” Warning message: “The following features are not present in the object: Ccdc101, Cirh1a, Hn1l, Cct6a, Atp5o, Apitd1, Stra13, Nt5c3l, Cpsf3l, D17Wsu104e, Rqcd1, Lrdd, Supt16h, Slmo2, Trpc2, Mrp63, Ubl4, Sip1, Ankrd32, Tubg2, 11-Sep, Dynlt1-ps1, Fam165b, Tmem48, Mlf1ip, Hn1, Th1l, SNORD93, Tssc1, D17H6S56E-5, Rad51l1, Sgol1, Casc5, Fam64a, D2Ertd750e, Fam54a, Hist1h4m, not searching for symbol synonyms”
so_qc <- NormalizeData(
object = so_qc,
assay = "RNA",
normalization.method = "LogNormalize",
scale.factor = 10000
)
so_qc <- FindVariableFeatures(
object = so_qc,
assay = "RNA",
selection.method = "vst",
nfeatures = 2000
)
# variable_features <- VariableFeatures(so_qc)
# VariableFeatures(so_qc) <- variable_features[!variable_features %in% c(cc_kowalczyk$gene)]
Regress nCount_RNA (total UMI per cell) and pMT_RNA (mitochondrial UMI percentage)
so_qc <- ScaleData(
object = so_qc,
assay = "RNA",
vars.to.regress = c("nCount_RNA", "pMt_RNA", "msCC_diff_RNA"),
do.scale = TRUE,
do.center = TRUE
)
Regressing out nCount_RNA, pMt_RNA, msCC_diff_RNA Centering and scaling data matrix
source("script/seurat/seurat_function.R")
so_qc <- dim_clust(
# Seurat
so = so_qc,
assay = "RNA",
# PCA features
blacklist_genes = NULL,
# Dim reduction
dims_pca = 30, # Default 20 - RunPCA dims
dims_umap = 30, # Default NULL - RunUMAP dims
dims_tsne = 20, # Default 1:5
min_dist = 0.3, # Default 0.3 - RunUMAP dmin.ist - controls how tightly the embedding
# Cluster
dims_cluster = 30, # Default 1:10 - FindNeighbors dims
cluster_res = 0.8 # Default 0.8 - FindClusters resoluton - above(below) 1.0 to obtain larger(smaller) number of communities)
)
PC_ 1 Positive: Tmsb4x, Tyrobp, Cd52, Fcer1g, Ptprc, Laptm5, Cyba, Cd74, Coro1a, Cst3 Ctss, Lcp1, Lsp1, Sh3bgrl3, Psap, Malat1, Napsa, Gm2a, Lst1, Actb Arpc1b, Gngt2, Sat1, H2-Aa, Rgs2, Fyb, Spi1, Pou2f2, Gm42418, Cybb Negative: Car2, Tubb4b, Blvrb, Prdx2, Glrx5, Dut, Hmbs, Hba-a1, Hbb-bs, Stmn1 Car1, Hbb-bt, Mki67, Cd24a, Ran, Cpox, Alad, Ctse, Birc5, Rps2 Rhd, Hba-a2, Ybx3, Odc1, Mgst3, C1qtnf12, Rrm2, Hsp90aa1, Hemgn, Tmem14c PC_ 2 Positive: Cebpb, Ace, Lyz2, Msrb1, Csf1r, Itgal, Fcgr4, Adgre4, Cx3cr1, Cd300ld Smpdl3a, Ceacam1, Ctsb, Clec4a3, Cybb, Fabp4, Spn, Stk10, Pou2f2, Cd300e Cyp4f18, Tnfrsf1b, Tppp3, Smpdl3b, Nupr1, Lrp1, Ear2, Klf4, Pilra, Gngt2 Negative: H2-Eb1, Dnase1l3, H2-Ab1, Ciita, H2-Aa, Ccnd1, H2-Oa, S100a11, Xcr1, Wdfy4 Tbc1d4, Slamf7, Cd74, H2-DMa, Ptms, Id2, Batf3, Cd8a, Adam23, Rab7b Jaml, Clec9a, Asb2, H2-DMb1, Flt3, Avpi1, Ppp1r14a, H2-DMb2, Ffar2, Dock10 PC_ 3 Positive: Plac8, Coro1a, Emp3, Napsa, Srgn, Ifitm2, Pglyrp1, Arhgdib, Alox5ap, Spn Lsp1, Itgal, S100a10, Gmfg, Vim, Sh3bgrl3, Tmsb10, Cyp4f18, Stap1, Cbfa2t3 S100a11, Ptpre, S100a4, S100a6, Ace, Fxyd5, Klf2, Eno3, Lyz2, Ifitm3 Negative: Vcam1, C1qc, C1qa, C1qb, Fcna, Cd5l, Slc40a1, Mertk, Igf1, Axl Maf, Mrc1, Mafb, Itgad, Hmox1, Spic, Ccr3, Slpi, Ptgs1, Kcna2 Epb41l3, Frmd4b, Hpgds, Sema6d, Sdc3, Kcnj10, Clec4n, Gfra2, Pld3, Abcc3 PC_ 4 Positive: Gata2, Cd63, S100a10, Nkg7, Cpa3, Ms4a2, Cmtm7, Fcer1a, Ifitm1, Mcpt8 Myb, Tagln2, Ptprcap, Hdc, Cd200r3, Csrp3, Ctsg, Cyp11a1, Cdk6, Ccl3 Sox4, Lgals1, Serpinb1a, Rab44, Ikzf2, Prss34, Csf1, Ms4a3, Gm15915, Igf1r Negative: Hbb-bt, Hba-a1, Ctse, Hbb-bs, Cd24a, Hba-a2, Mgst3, Gypa, Rhd, Hmbs Cldn13, Tmem14c, Slc4a1, Hemgn, Cpox, Tfrc, Hebp1, Hagh, Smim1, Prdx2 Rrm2, Alad, Car2, Slc25a37, Glrx5, Rhag, Blvrb, Lmna, H2-Eb1, Stard10 PC_ 5 Positive: S100a4, Ccnd1, Ccl5, Gm15915, Ffar2, Plxnc1, Mt1, Relb, Dtx1, Siglecg Mt2, Ltb, Tmem176b, Npm3, H2-DMb2, Srm, Rangrf, Ppp1r14a, Myb, Tmem176a Cyp4f16, Traf1, Npm1, Bcl2a1b, Hspd1, Arap2, S100a6, Avpi1, St8sia6, Il4i1 Negative: Xcr1, Naaa, Rab7b, Cd8a, Irf8, Clec9a, Ppt1, P2ry14, Tlr3, A530099J19Rik Cadm1, AC163354.1, Ifi205, Pianp, Lgals3, Itgae, Pdia5, Gcsam, Mpeg1, Ms4a2 Cd200r3, Fcer1a, Cyp11a1, Sept3, Cpa3, Dbn1, Ccl3, Mcpt8, Hdc, Tlr11 Computing nearest neighbor graph Computing SNN Computing nearest neighbors Only one graph name supplied, storing nearest-neighbor graph only Warning message: “The following arguments are not used: assay” Warning message: “The following arguments are not used: assay”
Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck Number of nodes: 22290 Number of edges: 772251 Running Louvain algorithm... Maximum modularity in 10 random starts: 0.8802 Number of communities: 21 Elapsed time: 2 seconds
Warning message: “The default method for RunUMAP has changed from calling Python UMAP via reticulate to the R-native UWOT using the cosine metric To use Python UMAP via reticulate, set umap.method to 'umap-learn' and metric to 'correlation' This message will be shown once per session” 08:51:14 UMAP embedding parameters a = 0.9922 b = 1.112 08:51:15 Commencing smooth kNN distance calibration using 1 thread 08:51:18 Initializing from normalized Laplacian + noise 08:51:19 Commencing optimization for 200 epochs, with 610058 positive edges 08:51:43 Optimization finished 08:51:43 UMAP embedding parameters a = 0.9922 b = 1.112 08:51:43 Read 22290 rows and found 30 numeric columns 08:51:43 Using Annoy for neighbor search, n_neighbors = 20 08:51:43 Building Annoy index with metric = euclidean, n_trees = 50 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * | 08:51:45 Writing NN index file to temp file /tmp/RtmpPiamd8/file1b0236ef5399 08:51:45 Searching Annoy index using 1 thread, search_k = 2000 08:51:49 Annoy recall = 100% 08:51:50 Commencing smooth kNN distance calibration using 1 thread 08:51:53 Initializing from normalized Laplacian + noise 08:51:54 Commencing optimization for 200 epochs, with 611186 positive edges 08:52:18 Optimization finished Warning message: “Cannot add objects with duplicate keys (offending key: UMAP_), setting key to 'rna_umap_'”
SingleR identifies marker genes from the reference and uses them to compute assignment scores (based on the Spearman correlation across markers) for each cell in the test dataset against each label in the reference. The label with the highest score is the assigned to the test cell, possibly with further fine-tuning to resolve closely related labels.
first.labels: Labels before fine-tuning
labels: Labels after fine-tuning
pruning: labels after pruning
#load celldex Immgen reference
ref <- celldex::ImmGenData(ensembl = FALSE)
# Seurat to SingleCellExperiment
sce <- SingleCellExperiment(list(counts = so_qc@assays$RNA@counts))
# Predict labels
label_main <- SingleR::SingleR(test = sce, ref = ref, labels = ref$label.main, assay.type.test = "counts", de.method = "classic")
label_fine <- SingleR::SingleR(test = sce, ref = ref, labels = ref$label.fine, assay.type.test = "counts", de.method = "classic")
# Add labels to Seurat object
label_main_meta <- as.data.frame(label_main) %>% dplyr::select(pruned.labels, tuning.scores.first) %>% dplyr::rename(main_labels = pruned.labels, main_delta_score = tuning.scores.first)
so_qc <- AddMetaData(so_qc, label_main_meta)
so_qc$main_labels <- factor(so_qc$main_labels, levels = names(so_color$main_labels))
label_fine_meta <- as.data.frame(label_fine) %>% dplyr::select(pruned.labels, tuning.scores.first) %>% dplyr::rename(fine_labels = pruned.labels, fine_delta_score = tuning.scores.first)
so_qc <- AddMetaData(so_qc, label_fine_meta)
# so_qc$fine_labels <- factor(so_qc$main_labels, levels = names(so_color$main_labels)) # Dont have a sort yet
snapshotDate(): 2021-05-18
see ?celldex and browseVignettes('celldex') for documentation
loading from cache
see ?celldex and browseVignettes('celldex') for documentation
loading from cache
# Dim and Feature plots
reduction <- "rna_umap_nno"
dplot_1 <- DimPlot(so_qc, reduction = reduction, group.by = "rna_snn_res.0.8", label = TRUE) &
theme(aspect.ratio = 1, legend.position = "none")
dplot_2 <- DimPlot(so_qc, reduction = reduction, group.by = "main_labels", label = FALSE) &
theme(aspect.ratio = 1, legend.position = "bottom") &
scale_color_manual(values = so_color$main_labels, na.value = "dark gray") &
guides(color = guide_legend(ncol = 3, override.aes = list(size = 2)))
dplot_3 <- DimPlot(so_qc, reduction = reduction, group.by = "tissue", label = FALSE) &
theme(aspect.ratio = 1, legend.position = "bottom") &
scale_color_manual(values = so_color$tissue, na.value = "dark gray") &
guides(color = guide_legend(ncol = 3, override.aes = list(size = 2)))
dplot_4 <- DimPlot(so_qc, reduction = reduction, group.by = "treatment", label = FALSE) &
theme(aspect.ratio = 1, legend.position = "bottom") &
scale_color_manual(values = so_color$treatment, na.value = "dark gray") &
guides(color = guide_legend(ncol = 3, override.aes = list(size = 2)))
options(repr.plot.width = 10, repr.plot.height = 10)
dplot <- dplot_1 + dplot_2 + dplot_3 + dplot_4 + plot_layout(ncol = 2)
dplot
ggsave(dplot, filename = "result/plot/seurat/dimplot_1.png", width = 9, height = 9)
fplot_1 <- FeaturePlot(so_qc, reduction = reduction, features = "nCount_RNA") & theme(aspect.ratio = 1)
fplot_2 <- FeaturePlot(so_qc, reduction = reduction, features = "nFeature_RNA") & theme(aspect.ratio = 1)
fplot_3 <- FeaturePlot(so_qc, reduction = reduction, features = "main_delta_score") & theme(aspect.ratio = 1)
options(repr.plot.width = 15, repr.plot.height = 5)
fplot <- fplot_1 + fplot_2 + fplot_3 + plot_layout(ncol = 3)
fplot
ggsave(fplot, filename = "result/plot/seurat/fplot_1.png", width = 9, height = 6)
fplot_1 <- FeaturePlot(so_qc, reduction = reduction, features = "pMt_RNA") & theme(aspect.ratio = 1)
fplot_2 <- FeaturePlot(so_qc, reduction = reduction, features = "pRp_RNA") & theme(aspect.ratio = 1)
fplot_3 <- FeaturePlot(so_qc, reduction = reduction, features = "pHb_RNA") & theme(aspect.ratio = 1)
options(repr.plot.width = 15, repr.plot.height = 5)
fplot <- fplot_1 + fplot_2 + fplot_3 + plot_layout(ncol = 3)
fplot
ggsave(fplot, filename = "result/plot/seurat/fplot_2.png", width = 9, height = 6)
fplot_1 <- FeaturePlot(so_qc, reduction = reduction, features = "msG0_RNA1") & theme(aspect.ratio = 1)
fplot_2 <- FeaturePlot(so_qc, reduction = reduction, features = "msInt_RNA1") & theme(aspect.ratio = 1)
fplot_3 <- FeaturePlot(so_qc, reduction = reduction, features = "msCC_diff_RNA") & theme(aspect.ratio = 1)
options(repr.plot.width = 15, repr.plot.height = 5)
fplot <- fplot_1 + fplot_2 + fplot_3 + plot_layout(ncol = 3)
fplot
ggsave(fplot, filename = "result/plot/seurat/fplot_3.png", width = 9, height = 6)
cluster_tissue <- ggplot(so_qc@meta.data, aes(x = seurat_clusters, fill = tissue)) +
geom_bar(stat = "count", position = "fill") +
scale_fill_manual(values = so_color$tissue) +
ggtitle("Cluster frequency") + xlab("Cluster") + ylab("Cell frequency") +
theme(legend.position = "bottom")
cluster_treatment <- ggplot(so_qc@meta.data, aes(x = seurat_clusters, fill = treatment)) +
geom_bar(stat = "count", position = "fill") +
scale_fill_manual(values = so_color$treatment) +
ggtitle("Cluster frequency") + xlab("Cluster") + ylab("Cell frequency") +
theme(legend.position = "bottom")
options(repr.plot.width = 10, repr.plot.height = 5)
cluster_tissue + cluster_treatment + plot_layout(ncol = 2)
ggsave(cluster_tissue, filename = "result/plot/seurat/cluster_tissue.png", width = 6, height = 3)
ggsave(cluster_treatment, filename = "result/plot/seurat/cluster_treatment.png", width = 6, height = 3)
bar_1 <- ggplot(so_qc@meta.data, aes(x = seurat_clusters, y = pHb_RNA)) + geom_boxplot()
bar_2 <- ggplot(so_qc@meta.data, aes(x = seurat_clusters, y = pMt_RNA)) + geom_boxplot()
bar_3 <- ggplot(so_qc@meta.data, aes(x = seurat_clusters, y = pRp_RNA)) + geom_boxplot()
options(repr.plot.width = 10, repr.plot.height = 10)
bar_1 + bar_2 + bar_3 + plot_layout(ncol = 2)
fplot_1 <- FeaturePlot(so_qc, reduction = reduction, features = "Ptprc") & theme(aspect.ratio = 1) & ggtitle("Ptprc (CD45)")
fplot_2 <- FeaturePlot(so_qc, reduction = reduction, features = "Cd34") & theme(aspect.ratio = 1) & ggtitle("Cd34")
fplot_3 <- FeaturePlot(so_qc, reduction = reduction, features = "Kit") & theme(aspect.ratio = 1) & ggtitle("Kit")
fplot_4 <- FeaturePlot(so_qc, reduction = reduction, features = "Tfrc") & theme(aspect.ratio = 1) & ggtitle("Tfrc (CD71)")
fplot_5 <- FeaturePlot(so_qc, reduction = reduction, features = "Gata2") & theme(aspect.ratio = 1) & ggtitle("Gata2")
fplot_6 <- FeaturePlot(so_qc, reduction = reduction, features = "Gata1") & theme(aspect.ratio = 1) & ggtitle("Gata1")
fplot <- fplot_1 + fplot_2 + fplot_3 + fplot_4 + fplot_5 + fplot_6 + plot_layout(ncol = 3)
options(repr.plot.width = 15, repr.plot.height = 10)
fplot
ggsave(fplot, filename = "result/plot/seurat/fplot_2.png", width = 9, height = 6)
H2-K1 - Class I histocompatibility antigen, kappa-B alpha chain
H2-D1 - Class I histocompatibility antigen, D-B alpha chain
H2-L1 - not found
H2-Aa - Class II histocompatibility antigen, A-B alpha chain
H2-Ab1 - Class II histocompatibility antigen, A beta chain
H2-Eb1 - Class II histocompatibility antigen, I-E beta chain
H2-Eb2 - Class II histocompatibility antigen, E beta chain
mhcI_genes <- c("H2-K1", "H2-D1")
mhcII_genes <- c("H2-Aa", "H2-Ab1", "H2-Eb1", "H2-Eb2")
cd4_genes <- c("Cd4")
cd8_genes <- c("Cd8a")
so_qc <- AddModuleScore(so_qc, features = list(mhcI_genes), assays = "RNA", slot = "data", ctrl = 100, nbin = 25, name = "msMHCI_RNA")
so_qc <- AddModuleScore(so_qc, features = list(mhcII_genes), assays = "RNA", slot = "data", ctrl = 100, nbin = 25, name = "msMHCII_RNA")
so_qc <- AddModuleScore(so_qc, features = list(cd4_genes), assays = "RNA", slot = "data", ctrl = 100, nbin = 25, name = "msCd4_RNA")
so_qc <- AddModuleScore(so_qc, features = list(cd8_genes), assays = "RNA", slot = "data", ctrl = 100, nbin = 25, name = "msCd8_RNA")
fplot_mhcI <- FeaturePlot(so_qc, reduction = reduction, features = "msMHCI_RNA1") & theme(aspect.ratio = 1) & ggtitle("MHC class I")
fplot_mhcII <- FeaturePlot(so_qc, reduction = reduction, features = "msMHCII_RNA1") & theme(aspect.ratio = 1) & ggtitle("MHC class II")
fplot_cd4 <- FeaturePlot(so_qc, reduction = reduction, features = "msCd4_RNA1") & theme(aspect.ratio = 1) & ggtitle("CD4")
fplot_cd8 <- FeaturePlot(so_qc, reduction = reduction, features = "msCd8_RNA1") & theme(aspect.ratio = 1) & ggtitle("CD8")
fplot_mhc <- fplot_mhcI + fplot_mhcII + fplot_cd4 + fplot_cd8 + plot_layout(ncol = 2)
options(repr.plot.width = 15, repr.plot.height = 10)
fplot_mhc
ggsave(fplot_mhc, filename = "result/plot/seurat/fplot_mhc.png", width = 6, height = 6)
Trac - T cell receptor alpha constant
Trbc1 - T cell receptor beta constant 1
Trbc2 - T cell receptor beta constant 2
Trdc - T cell receptor delta constant
Trgc1 - T cell receptor gamma constant 1
Trgc2 - T cell receptor gamma constant 2
Cd247 - T-cell surface glycoprotein Cd3 zeta chain (Cd3z)
Cd3g - T-cell surface glycoprotein Cd3 gamma chain
Cd3e - T-cell surface glycoprotein Cd3 epsilon chain
Cd3d - T-cell surface glycoprotein Cd3 delta chain
tcr_genes <- c("Trac", "Trbc1", "Trbc2", "Trdc", "Trgc1", "Trgc2")
tcr_cd3_genes <- c("Cd247", "Cd3g", "Cd3e", "Cd3d")
so_qc <- AddModuleScore(so_qc, features = list(tcr_genes), assays = "RNA", slot = "data", ctrl = 100, nbin = 25, name = "msTcr_RNA")
so_qc <- AddModuleScore(so_qc, features = list(tcr_cd3_genes), assays = "RNA", slot = "data", ctrl = 100, nbin = 25, name = "msTcr_cd3_RNA")
fplot_tcr <- FeaturePlot(so_qc, reduction = reduction, features = "msTcr_RNA1") & theme(aspect.ratio = 1) & ggtitle("TCR")
fplot_tcr_cd3 <- FeaturePlot(so_qc, reduction = reduction, features = "msTcr_cd3_RNA1") & theme(aspect.ratio = 1) & ggtitle("Cd247/Cd3 family")
fplot_tcr_cd3 <- fplot_tcr + fplot_tcr_cd3 + plot_layout(ncol = 2)
options(repr.plot.width = 10, repr.plot.height = 5)
fplot_tcr_cd3
ggsave(fplot_tcr_cd3, filename = "result/plot/seurat/fplot_tcr_cd3.png", width = 6, height = 3)
Warning message: “The following features are not present in the object: Trgc1, Trgc2, not searching for symbol synonyms”
Naive B-cells produce the following Ig classes:
Ighm - Immunoglobulin heavy constant mu (naive B-cells)
Ighd - Immunoglobulin heavy constant delta (naive B-cells)
Through isotope switching the following Ig classes can be produced:
Ighg1 - Immunoglobulin heavy constant gamma (Mouse with Igh1b have Igg2c isotope instead of Igg2a)
Ighg2a - Immunoglobulin heavy constant gamma (NA)
Ighg2b - Immunoglobulin heavy constant gamma
Ighg2c - Immunoglobulin heavy constant gamma
Ighg3 - Immunoglobulin heavy constant gamma
Igha - Immunoglobulin heavy constant alpha
Ighe - Immunoglobulin heavy constant epsilon (NA)
Igkc - Immunoglobulin kappa constant (light chain)
Iglc - Immunoglobulin lambda constant (light chain)
Ighm_genes <- c("Ighm")
Ighd_genes <- c("Ighd")
Ighg_genes <- c("Ighg1", "Ighg2a", "Ighg2b", "Ighg2c", "Ighg3")
Igha_genes <- c("Igha")
Ighe_genes <- c("Ighe")
Igkc_genes <- c("Igkc")
Iglc_genes <- c("Iglc1", "Iglc2", "Iglc3")
so_qc <- AddModuleScore(so_qc, features = list(Ighm_genes), assays = "RNA", slot = "data", ctrl = 100, nbin = 25, name = "msIghm_RNA")
so_qc <- AddModuleScore(so_qc, features = list(Ighd_genes), assays = "RNA", slot = "data", ctrl = 100, nbin = 25, name = "msIghd_RNA")
so_qc <- AddModuleScore(so_qc, features = list(Ighg_genes), assays = "RNA", slot = "data", ctrl = 100, nbin = 25, name = "msIghg_RNA")
so_qc <- AddModuleScore(so_qc, features = list(Igha_genes), assays = "RNA", slot = "data", ctrl = 100, nbin = 25, name = "msIgha_RNA")
#so_qc <- AddModuleScore(so_qc, features = list(Ighe_genes), assays = "RNA", slot = "data", ctrl = 100, nbin = 25, name = "msIghe_RNA")
so_qc <- AddModuleScore(so_qc, features = list(Igkc_genes), assays = "RNA", slot = "data", ctrl = 100, nbin = 25, name = "msIgkc_RNA")
so_qc <- AddModuleScore(so_qc, features = list(Iglc_genes), assays = "RNA", slot = "data", ctrl = 100, nbin = 25, name = "msIglc_RNA")
fplot_Ighm <- FeaturePlot(so_qc, reduction = reduction, features = "msIghm_RNA1") & theme(aspect.ratio = 1) & ggtitle("Ighm")
fplot_Ighd <- FeaturePlot(so_qc, reduction = reduction, features = "msIghd_RNA1") & theme(aspect.ratio = 1) & ggtitle("Ighd")
fplot_Ighg <- FeaturePlot(so_qc, reduction = reduction, features = "msIghg_RNA1") & theme(aspect.ratio = 1) & ggtitle("Ighg")
fplot_Igha <- FeaturePlot(so_qc, reduction = reduction, features = "msIgha_RNA1") & theme(aspect.ratio = 1) & ggtitle("Igha")
fplot_Igkc <- FeaturePlot(so_qc, reduction = reduction, features = "msIgkc_RNA1") & theme(aspect.ratio = 1) & ggtitle("Igkc")
fplot_Iglc <- FeaturePlot(so_qc, reduction = reduction, features = "msIglc_RNA1") & theme(aspect.ratio = 1) & ggtitle("Iglc")
fplot_ig <- fplot_Ighm + fplot_Ighd + fplot_Ighg + fplot_Igha + fplot_Igkc + fplot_Iglc + plot_layout(ncol = 3)
options(repr.plot.width = 15, repr.plot.height = 10)
fplot_ig
ggsave(fplot_ig, filename = "result/plot/seurat/fplot_ig.png", width = 9, height = 6)
Warning message: “The following features are not present in the object: Ighg2a, not searching for symbol synonyms”
# Write tsv file for each sample into 10x sample dir
for (sample_path in unique(so_qc$sample_path)) {
tsv_file <- paste0(sample_path, "/filtered_feature_bc_matrix/barcodes.qc.tsv")
print(tsv_file)
cell_id <- so_qc@meta.data[so_qc@meta.data$sample_path == sample_path, ]$cell_id
library(stringr)
write.table(x = cell_id,
file = tsv_file,
row.names = FALSE,
col.names = FALSE,
sep = "\t",
quote = FALSE)
}
[1] "data/BSA_0355_SM01_10x_SPLENO/OUT/COUNT/M1_CpG_Mac_transcriptome/filtered_feature_bc_matrix/barcodes.qc.tsv" [1] "data/BSA_0355_SM01_10x_SPLENO/OUT/COUNT/M1_CpG_Prog_transcriptome/filtered_feature_bc_matrix/barcodes.qc.tsv" [1] "data/BSA_0355_SM01_10x_SPLENO/OUT/COUNT/M2_CpG_Mac_transcriptome/filtered_feature_bc_matrix/barcodes.qc.tsv" [1] "data/BSA_0355_SM01_10x_SPLENO/OUT/COUNT/M2_CpG_Prog_transcriptome/filtered_feature_bc_matrix/barcodes.qc.tsv" [1] "data/BSA_0355_SM01_10x_SPLENO/OUT/COUNT/M6_control_Mac_transcriptome/filtered_feature_bc_matrix/barcodes.qc.tsv" [1] "data/BSA_0355_SM01_10x_SPLENO/OUT/COUNT/M6_control_Prog_transcriptome/filtered_feature_bc_matrix/barcodes.qc.tsv" [1] "data/BSA_0355_SM01_10x_SPLENO/OUT/COUNT/M8_control_Mac_transcriptome/filtered_feature_bc_matrix/barcodes.qc.tsv" [1] "data/BSA_0355_SM01_10x_SPLENO/OUT/COUNT/M8_control_Prog_transcriptome/filtered_feature_bc_matrix/barcodes.qc.tsv"
# Write Seurat object
saveRDS(so_qc, so_qc_file)
# Write loom object
as.loom(so_qc, filename = loom_qc_file, verbose = FALSE)
Adding col attribute CellID Adding col attribute nCount_RNA Adding col attribute nFeature_RNA Adding col attribute sample_name Adding col attribute tissue Adding col attribute treatment Adding col attribute sample_path Adding col attribute sample_dir Adding col attribute cellranger_class Adding col attribute cell_id Adding col attribute nCount_RNA_rank Adding col attribute pMt_RNA Adding col attribute pHb_RNA Adding col attribute pRp_RNA Adding col attribute nFeature_RNA_max Adding col attribute nFeature_RNA_min Adding col attribute nCount_RNA_max Adding col attribute nCount_RNA_min Adding col attribute pMt_RNA_max Adding col attribute qc_class Adding col attribute msS_RNA Adding col attribute msInt_RNA Adding col attribute cc_phase_class Adding col attribute msCC_diff_RNA Adding col attribute msG0_RNA1 Adding col attribute msInt_RNA1 Adding col attribute rna_snn_res.0.8 Adding col attribute seurat_clusters Adding col attribute main_labels Warning message in ds$write_low_level(robj): “During conversion, the following issues occured: H5T_CONV_EXCEPT_RANGE_LOW” Adding col attribute main_delta_score Adding col attribute fine_labels Adding col attribute fine_delta_score Adding col attribute msMHCI_RNA1 Adding col attribute msMHCII_RNA1 Adding col attribute msCd4_RNA1 Adding col attribute msCd8_RNA1 Adding col attribute msTcr_RNA1 Adding col attribute msTcr_cd3_RNA1 Adding col attribute msIghm_RNA1 Adding col attribute msIghd_RNA1 Adding col attribute msIghg_RNA1 Adding col attribute msIgha_RNA1 Adding col attribute msIgkc_RNA1 Adding col attribute msIglc_RNA1 Adding row attribute Gene
Class: loom
Filename: /research/peer/fdeckert/FD20200109SPLENO/data/loom_object/loom_qc.loom
Access type: H5F_ACC_RDWR
Listing:
name obj_type dataset.dims dataset.type_class
attrs H5I_GROUP <NA> <NA>
col_attrs H5I_GROUP <NA> <NA>
col_graphs H5I_GROUP <NA> <NA>
layers H5I_GROUP <NA> <NA>
matrix H5I_DATASET 22290 x 15864 H5T_FLOAT
row_attrs H5I_GROUP <NA> <NA>
row_graphs H5I_GROUP <NA> <NA>
sessionInfo()
R version 4.1.0 (2021-05-18) Platform: x86_64-conda-linux-gnu (64-bit) Running under: CentOS Linux 7 (Core) Matrix products: default BLAS/LAPACK: /home/fdeckert/bin/miniconda3/envs/r.4.1.0-FD20200109SPLENO/lib/libopenblasp-r0.3.15.so locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=en_US.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] parallel stats4 stats graphics grDevices utils datasets [8] methods base other attached packages: [1] stringr_1.4.0 patchwork_1.1.1 [3] RColorBrewer_1.1-2 ggplot2_3.3.3 [5] dplyr_1.0.6 celldex_1.2.0 [7] SingleR_1.6.1 SingleCellExperiment_1.14.1 [9] SummarizedExperiment_1.22.0 Biobase_2.52.0 [11] GenomicRanges_1.44.0 GenomeInfoDb_1.28.0 [13] IRanges_2.26.0 S4Vectors_0.30.0 [15] BiocGenerics_0.38.0 MatrixGenerics_1.4.0 [17] matrixStats_0.59.0 SeuratDisk_0.0.0.9019 [19] SeuratObject_4.0.1 Seurat_4.0.2 loaded via a namespace (and not attached): [1] uuid_0.1-4 AnnotationHub_3.0.0 [3] BiocFileCache_2.0.0 plyr_1.8.6 [5] igraph_1.2.6 repr_1.1.3 [7] lazyeval_0.2.2 splines_4.1.0 [9] BiocParallel_1.26.0 listenv_0.8.0 [11] scattermore_0.7 digest_0.6.27 [13] htmltools_0.5.1.1 fansi_0.5.0 [15] memoise_2.0.0 magrittr_2.0.1 [17] ScaledMatrix_1.0.0 tensor_1.5 [19] cluster_2.1.2 ROCR_1.0-11 [21] Biostrings_2.60.0 globals_0.14.0 [23] spatstat.sparse_2.0-0 colorspace_2.0-1 [25] rappdirs_0.3.3 blob_1.2.1 [27] ggrepel_0.9.1 crayon_1.4.1 [29] RCurl_1.98-1.3 jsonlite_1.7.2 [31] spatstat.data_2.1-0 survival_3.2-11 [33] zoo_1.8-9 glue_1.4.2 [35] polyclip_1.10-0 gtable_0.3.0 [37] zlibbioc_1.38.0 XVector_0.32.0 [39] leiden_0.3.8 DelayedArray_0.18.0 [41] BiocSingular_1.8.0 future.apply_1.7.0 [43] abind_1.4-5 scales_1.1.1 [45] DBI_1.1.1 miniUI_0.1.1.1 [47] Rcpp_1.0.6 viridisLite_0.4.0 [49] xtable_1.8-4 reticulate_1.20 [51] spatstat.core_2.1-2 bit_4.0.4 [53] rsvd_1.0.5 htmlwidgets_1.5.3 [55] httr_1.4.2 ellipsis_0.3.2 [57] ica_1.0-2 farver_2.1.0 [59] pkgconfig_2.0.3 dbplyr_2.1.1 [61] uwot_0.1.10 deldir_0.2-10 [63] utf8_1.2.1 labeling_0.4.2 [65] AnnotationDbi_1.54.0 tidyselect_1.1.1 [67] rlang_0.4.11 reshape2_1.4.4 [69] later_1.2.0 BiocVersion_3.13.1 [71] cachem_1.0.5 munsell_0.5.0 [73] tools_4.1.0 cli_2.5.0 [75] ExperimentHub_2.0.0 RSQLite_2.2.5 [77] generics_0.1.0 ggridges_0.5.3 [79] evaluate_0.14 fastmap_1.1.0 [81] yaml_2.2.1 goftest_1.2-2 [83] bit64_4.0.5 fitdistrplus_1.1-5 [85] purrr_0.3.4 RANN_2.6.1 [87] KEGGREST_1.32.0 pbapply_1.4-3 [89] future_1.21.0 nlme_3.1-152 [91] sparseMatrixStats_1.4.0 mime_0.10 [93] hdf5r_1.3.3 compiler_4.1.0 [95] interactiveDisplayBase_1.30.0 filelock_1.0.2 [97] curl_4.3.1 plotly_4.9.3 [99] png_0.1-7 spatstat.utils_2.1-0 [101] tibble_3.1.2 stringi_1.6.2 [103] RSpectra_0.16-0 lattice_0.20-44 [105] IRdisplay_1.0 Matrix_1.3-4 [107] vctrs_0.3.8 pillar_1.6.1 [109] lifecycle_1.0.0 BiocManager_1.30.15 [111] spatstat.geom_2.1-0 lmtest_0.9-38 [113] RcppAnnoy_0.0.18 BiocNeighbors_1.10.0 [115] data.table_1.14.0 cowplot_1.1.1 [117] bitops_1.0-7 irlba_2.3.3 [119] httpuv_1.6.1 R6_2.5.0 [121] promises_1.2.0.1 KernSmooth_2.23-20 [123] gridExtra_2.3 parallelly_1.25.0 [125] codetools_0.2-18 MASS_7.3-54 [127] assertthat_0.2.1 withr_2.4.2 [129] sctransform_0.3.2 GenomeInfoDbData_1.2.6 [131] mgcv_1.8-36 grid_4.1.0 [133] rpart_4.1-15 beachmat_2.8.0 [135] IRkernel_1.2 tidyr_1.1.3 [137] DelayedMatrixStats_1.14.0 Rtsne_0.15 [139] pbdZMQ_0.3-5 shiny_1.6.0 [141] base64enc_0.1-3